[1]:
import emat
from emat.util.loggers import timing_log
emat.versions()
emat 0.5.2, plotly 4.14.3

Meta-Model Creation

To demonstrate the creation of a meta-model, we will use the Road Test example model included with TMIP-EMAT. We first create and run a design of experiments to produce the experimental data used to fit the meta-model.

[2]:
import emat.examples
scope, db, model = emat.examples.road_test()
design = model.design_experiments(design_name='lhs')
results = model.run_experiments(design)
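
For readers unfamiliar with the sampling idea, here is a minimal pure-Python sketch of Latin hypercube sampling, assuming that the `lhs` label in the cell above refers to that technique. The function `latin_hypercube`, its arguments, and the toy bounds are hypothetical and are not part of the TMIP-EMAT API; the library's own sampler may differ in its details.

```python
import random

def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One uniform draw inside each stratum of this dimension.
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Toy example: 5 samples over two hypothetical input ranges.
samples = latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], seed=42)
for point in samples:
    print(point)
```

The stratification is what lets a small number of model runs cover each input range evenly. The `design` and `results` objects used in the rest of this notebook come from the EMAT calls in the cell above, not from this sketch.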

We can then create a meta-model automatically from these experiments.

[3]:
from emat.model import create_metamodel

with timing_log("create metamodel"):
    mm = create_metamodel(scope, results, suppress_converge_warnings=True)
<TIME BEGINS> create metamodel
< TIME ENDS > create metamodel <11.14s>

If you are using the default meta-model regressor, as we are here, you can directly access a cross-validation method that uses the experimental data to evaluate the quality of the regression model. The cross_val_scores method reports, for each performance measure, how well the meta-model predicts the experimental outcomes, similar to an R^2 measure for a linear regression model.

[4]:
with timing_log("crossvalidate metamodel"):
    display(mm.cross_val_scores())
<TIME BEGINS> crossvalidate metamodel
Cross Validation Score
no_build_travel_time 0.9908
build_travel_time 0.9917
time_savings 0.9125
value_of_time_savings 0.9113
net_benefits 0.6382
cost_of_capacity_expansion 0.8978
present_cost_expansion 0.9461
< TIME ENDS > crossvalidate metamodel <14.22s>
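
To make the idea of a cross-validation score more concrete, the sketch below computes a k-fold cross-validated R^2 for a one-variable least-squares line in pure Python. It illustrates the general technique only and is not how EMAT computes its scores: the default meta-model regressor is more elaborate, and the helper names (`fit_line`, `r_squared`, `cross_val_r2`) and the synthetic data are assumptions made for this example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def cross_val_r2(xs, ys, k=5):
    """Average R^2 over k folds; each fold is scored on points
    that were held out when the line was fitted."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))
        pairs = list(enumerate(zip(xs, ys)))
        train = [xy for i, xy in pairs if i not in test_idx]
        test = [xy for i, xy in pairs if i in test_idx]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        preds = [a + b * x for x, _ in test]
        scores.append(r_squared([y for _, y in test], preds))
    return sum(scores) / k

# Synthetic data: a noisy straight line (hypothetical, for illustration).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]
score = cross_val_r2(xs, ys)
print(round(score, 3))
```

Because every fold is scored on data the fit never saw, a score near 1 indicates that the model generalizes well rather than merely reproducing its training points.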

We can apply the meta-model directly on a new design of experiments, and use the contrast_experiments visualization tool to review how well the meta-model is replicating the underlying model’s results.

[5]:
design2 = mm.design_experiments(design_name='lhs_meta', n_samples=10_000)
[6]:
with timing_log("apply metamodel"):
    results2 = mm.run_experiments(design2)
<TIME BEGINS> apply metamodel
< TIME ENDS > apply metamodel <0.11s>
[7]:
results2.info()
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 0 to 9999
Data columns (total 20 columns):
 #   Column                      Non-Null Count  Dtype
---  ------                      --------------  -----
 0   alpha                       10000 non-null  float64
 1   amortization_period         10000 non-null  int64
 2   beta                        10000 non-null  float64
 3   debt_type                   10000 non-null  category
 4   expand_capacity             10000 non-null  float64
 5   input_flow                  10000 non-null  int64
 6   interest_rate               10000 non-null  float64
 7   interest_rate_lock          10000 non-null  bool
 8   unit_cost_expansion         10000 non-null  float64
 9   value_of_time               10000 non-null  float64
 10  yield_curve                 10000 non-null  float64
 11  free_flow_time              10000 non-null  int64
 12  initial_capacity            10000 non-null  int64
 13  no_build_travel_time        10000 non-null  float64
 14  build_travel_time           10000 non-null  float64
 15  time_savings                10000 non-null  float64
 16  value_of_time_savings       10000 non-null  float64
 17  net_benefits                10000 non-null  float64
 18  cost_of_capacity_expansion  10000 non-null  float64
 19  present_cost_expansion      10000 non-null  float64
dtypes: bool(1), category(1), float64(14), int64(4)
memory usage: 1.5 MB
[8]:
from emat.analysis import contrast_experiments
contrast_experiments(mm.scope, results2, results)

[Figure: contrast_experiments output, one panel per performance measure: No Build Time, Build Time, Time Savings, Value Time Save, Net Benefits, Cost of Expand, Present Cost.]

Partial Metamodels

It may be desirable in some cases to construct a partial meta-model covering only a subset of the performance measures. This is particularly useful when the scope includes a large number of performance measures but only a few are of interest for a given analysis. The time required to generate and use meta-models is linear in the number of performance measures included, so if you have 100 performance measures but are presently interested in only 5, the meta-model can be created much faster by including only those 5. It will also run faster, but the run time for meta-models is already so small that you are unlikely to notice the difference.

To create a partial meta-model for a curated set of performance measures, you can use the include_measures argument of the create_metamodel function.

[9]:
with timing_log("create limited metamodel"):
    mm2 = create_metamodel(
        scope, results,
        include_measures=['time_savings', 'present_cost_expansion'],
        suppress_converge_warnings=True,
    )

with timing_log("crossvalidate limited metamodel"):
    display(mm2.cross_val_scores())

with timing_log("apply limited metamodel"):
    results2_limited = mm2.run_experiments(design2)

results2_limited.info()
<TIME BEGINS> create limited metamodel
< TIME ENDS > create limited metamodel <3.33s>
<TIME BEGINS> crossvalidate limited metamodel
Cross Validation Score
time_savings 0.8559
present_cost_expansion 0.9297
< TIME ENDS > crossvalidate limited metamodel <5.47s>
<TIME BEGINS> apply limited metamodel
< TIME ENDS > apply limited metamodel <0.05s>
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 0 to 9999
Data columns (total 15 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   alpha                   10000 non-null  float64
 1   amortization_period     10000 non-null  int64
 2   beta                    10000 non-null  float64
 3   debt_type               10000 non-null  category
 4   expand_capacity         10000 non-null  float64
 5   input_flow              10000 non-null  int64
 6   interest_rate           10000 non-null  float64
 7   interest_rate_lock      10000 non-null  bool
 8   unit_cost_expansion     10000 non-null  float64
 9   value_of_time           10000 non-null  float64
 10  yield_curve             10000 non-null  float64
 11  free_flow_time          10000 non-null  int64
 12  initial_capacity        10000 non-null  int64
 13  time_savings            10000 non-null  float64
 14  present_cost_expansion  10000 non-null  float64
dtypes: bool(1), category(1), float64(9), int64(4)
memory usage: 1.1 MB
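
The creation times reported in this notebook can be checked against the linear-scaling claim with a little arithmetic. The sketch below uses the two wall-clock figures printed above (11.14 s for the full meta-model with seven measures, 3.33 s for the partial one with two); the helper name `per_measure_seconds` is hypothetical, and timings will differ from machine to machine.

```python
def per_measure_seconds(total_seconds, n_measures):
    """Average creation time attributable to one performance measure."""
    return total_seconds / n_measures

full_create = 11.14     # seconds, full meta-model, 7 measures (from the output above)
partial_create = 3.33   # seconds, partial meta-model, 2 measures (from the output above)

rate = per_measure_seconds(full_create, 7)
predicted_partial = rate * 2

print(f"{rate:.2f} s per measure")
print(f"predicted {predicted_partial:.2f} s, observed {partial_create:.2f} s")
```

The prediction of about 3.18 s is close to the observed 3.33 s, which is consistent with roughly linear scaling; the small gap may reflect fixed overhead that does not depend on the number of measures.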

There is also an exclude_measures argument for the create_metamodel function, which retains all of the scoped performance measures except those listed. This can be handy for dropping a few measures that are not working well, either because the data is bad in some way or because the measure is not fitted well by the meta-model.

[10]:
with timing_log("create limited metamodel"):
    mm3 = create_metamodel(
        scope, results,
        exclude_measures=['net_benefits'],
        suppress_converge_warnings=True,
    )

with timing_log("crossvalidate limited metamodel"):
    display(mm3.cross_val_scores())

with timing_log("apply limited metamodel"):
    results3_limited = mm3.run_experiments(design2)

results3_limited.info()
<TIME BEGINS> create limited metamodel
< TIME ENDS > create limited metamodel <10.62s>
<TIME BEGINS> crossvalidate limited metamodel
Cross Validation Score
no_build_travel_time 0.9807
build_travel_time 0.9736
time_savings 0.8864
value_of_time_savings 0.8363
cost_of_capacity_expansion 0.8896
present_cost_expansion 0.9454
< TIME ENDS > crossvalidate limited metamodel <11.09s>
<TIME BEGINS> apply limited metamodel
< TIME ENDS > apply limited metamodel <0.10s>
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 0 to 9999
Data columns (total 19 columns):
 #   Column                      Non-Null Count  Dtype
---  ------                      --------------  -----
 0   alpha                       10000 non-null  float64
 1   amortization_period         10000 non-null  int64
 2   beta                        10000 non-null  float64
 3   debt_type                   10000 non-null  category
 4   expand_capacity             10000 non-null  float64
 5   input_flow                  10000 non-null  int64
 6   interest_rate               10000 non-null  float64
 7   interest_rate_lock          10000 non-null  bool
 8   unit_cost_expansion         10000 non-null  float64
 9   value_of_time               10000 non-null  float64
 10  yield_curve                 10000 non-null  float64
 11  free_flow_time              10000 non-null  int64
 12  initial_capacity            10000 non-null  int64
 13  no_build_travel_time        10000 non-null  float64
 14  build_travel_time           10000 non-null  float64
 15  time_savings                10000 non-null  float64
 16  value_of_time_savings       10000 non-null  float64
 17  cost_of_capacity_expansion  10000 non-null  float64
 18  present_cost_expansion      10000 non-null  float64
dtypes: bool(1), category(1), float64(13), int64(4)
memory usage: 1.4 MB
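
The exclusion list in the cell above was written by hand. One possible way to derive such a list is to filter the cross-validation scores against a cutoff, as in the sketch below. The scores are copied from the full meta-model output earlier in this notebook; the 0.8 threshold and the helper `poorly_fitted` are assumptions for illustration, not EMAT defaults.

```python
# Cross-validation scores copied from the full meta-model output above.
scores = {
    "no_build_travel_time": 0.9908,
    "build_travel_time": 0.9917,
    "time_savings": 0.9125,
    "value_of_time_savings": 0.9113,
    "net_benefits": 0.6382,
    "cost_of_capacity_expansion": 0.8978,
    "present_cost_expansion": 0.9461,
}

THRESHOLD = 0.8  # hypothetical cutoff; choose one suited to your analysis

def poorly_fitted(scores, threshold):
    """Return the names of measures whose score falls below the threshold."""
    return [name for name, score in scores.items() if score < threshold]

exclude = poorly_fitted(scores, THRESHOLD)
print(exclude)  # ['net_benefits']
```

The resulting list could then be passed as the exclude_measures argument of create_metamodel, which with this threshold reproduces the choice made above.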